Overview

Dataset statistics

Number of variables: 18
Number of observations: 11286
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 1.6 MiB
Average record size in memory: 144.0 B

Variable types

Numeric: 7
DateTime: 1
Categorical: 8
Text: 2

Alerts

Constant: country has constant value "United States"
High correlation: total_transaction_revenue is highly overall correlated with products
High correlation: total_hits is highly overall correlated with total_pageviews and 2 other fields
High correlation: total_pageviews is highly overall correlated with total_hits and 2 other fields
High correlation: total_time_on_site is highly overall correlated with total_hits and 1 other field
High correlation: products is highly overall correlated with total_transaction_revenue and 2 other fields
High correlation: channel_grouping is highly overall correlated with traffic_source
High correlation: traffic_source is highly overall correlated with channel_grouping
High correlation: kmeans_cluster is highly overall correlated with agg_cluster and 1 other field
High correlation: agg_cluster is highly overall correlated with kmeans_cluster and 1 other field
High correlation: device_category is highly overall correlated with kmeans_cluster and 1 other field
Imbalance: browser is highly imbalanced (81.1%)
Imbalance: traffic_source is highly imbalanced (79.4%)
Imbalance: kmeans_cluster is highly imbalanced (63.3%)
Imbalance: dbscan_cluster is highly imbalanced (74.7%)
Imbalance: agg_cluster is highly imbalanced (52.0%)
Imbalance: device_category is highly imbalanced (69.7%)
Skewed: total_transaction_revenue is highly skewed (γ1 = 24.78989535)
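
The imbalance percentages in the alerts look consistent with one minus the normalized Shannon entropy of the category frequencies. The sketch below works on that assumption and is not the profiler's own code. The function name `imbalance_score` is hypothetical, and the counts are copied from the browser and kmeans_cluster sections of this report.

```python
import math


def imbalance_score(counts):
    """Return 1 - H / log2(k): H is the Shannon entropy (bits), k the class count."""
    total = sum(counts)
    k = len(counts)
    if k < 2:
        return 0.0  # assumption: a single class is reported as 0 here
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1.0 - entropy / math.log2(k)


# Category counts copied from this report.
browser_counts = [10204, 725, 179, 100, 53, 12, 6, 6, 1]
kmeans_counts = [9818, 814, 494, 160]

print(f"browser: {imbalance_score(browser_counts):.3f}")
print(f"kmeans_cluster: {imbalance_score(kmeans_counts):.3f}")
```

A score of 0 means perfectly even classes and a score near 1 means one class dominates, which matches browser (90.4% Chrome) scoring higher than kmeans_cluster.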

Reproduction

Analysis started: 2024-10-01 19:34:24.841072
Analysis finished: 2024-10-01 19:34:30.873088
Duration: 6.03 seconds
Software version: ydata-profiling v4.6.1
Download configuration: config.json
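
A report of this kind can typically be regenerated with a few lines of Python. The sketch below is an assumption about the workflow, not a record of how this report was produced. The function name, the CSV input, and the default title are hypothetical; `ProfileReport` and `to_file` are the public ydata-profiling entry points.

```python
def build_profile_report(csv_path, html_path, title="Profiling report"):
    """Profile a CSV file and write the HTML report to html_path."""
    # Imports are deferred so this module still loads on machines where
    # the third-party packages are not installed.
    import pandas as pd
    from ydata_profiling import ProfileReport

    df = pd.read_csv(csv_path)               # hypothetical input file
    report = ProfileReport(df, title=title)  # default settings
    report.to_file(html_path)
    return html_path
```

Calling `build_profile_report("sessions.csv", "report.html")` would write the report next to the script; both file names are placeholders.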

Variables

visitor_id
Real number (ℝ)

Distinct: 9502
Distinct (%): 84.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 4.5075969 × 10^18
Minimum: 2.1313114 × 10^14
Maximum: 9.998996 × 10^18
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 88.3 KiB

Quantile statistics

Minimum: 2.1313114 × 10^14
5-th percentile: 1.9855432 × 10^17
Q1: 1.646748 × 10^18
Median: 4.3876095 × 10^18
Q3: 7.1845698 × 10^18
95-th percentile: 9.4477648 × 10^18
Maximum: 9.998996 × 10^18
Range: 9.9987829 × 10^18
Interquartile range (IQR): 5.5378218 × 10^18

Descriptive statistics

Standard deviation: 3.0640877 × 10^18
Coefficient of variation (CV): 0.6797608
Kurtosis: -1.2702596
Mean: 4.5075969 × 10^18
Median Absolute Deviation (MAD): 2.7608994 × 10^18
Skewness: 0.13240198
Sum: 5.0872739 × 10^22
Variance: 9.3886334 × 10^36
Monotonicity: Increasing
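
visitor_id is profiled as a real number although its values behave like identifiers of up to 19 digits. The report does not state the column's dtype. If it were stored as 64-bit floats (an assumption), integers above 2^53 would no longer be exactly representable and neighbouring identifiers could collapse into one value. The sketch below illustrates the effect with a hypothetical identifier of similar magnitude.

```python
# Every integer up to 2**53 is exactly representable as a 64-bit float.
FLOAT64_EXACT_LIMIT = 2 ** 53


def survives_float_roundtrip(identifier: int) -> bool:
    """Return True if converting to float and back preserves the identifier."""
    return int(float(identifier)) == identifier


hypothetical_id = 2 ** 62 + 1  # about 4.6 x 10^18, not taken from the data

print(survives_float_roundtrip(FLOAT64_EXACT_LIMIT))
print(survives_float_roundtrip(hypothetical_id))
print(float(hypothetical_id) == float(hypothetical_id - 1))
```

Where exact identifiers matter, reading such a column as a string or as an unsigned 64-bit integer avoids the rounding.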
Histogram with fixed size bins (bins=50)
Common values

Value                Count  Frequency (%)
7.813149961 × 10^18  35     0.3%
6.760732402 × 10^18  25     0.2%
1.957458976 × 10^18  22     0.2%
4.984366501 × 10^18  17     0.2%
2.4025272 × 10^18    15     0.1%
6.089151977 × 10^17  14     0.1%
9.662800125 × 10^18  12     0.1%
7.311242886 × 10^18  12     0.1%
5.526675926 × 10^18  12     0.1%
7.71301243 × 10^18   11     0.1%
Other values (9492)  11111  98.4%

Minimum 10 values

Value                Count  Frequency (%)
2.131311426 × 10^14  1      < 0.1%
4.353240613 × 10^14  1      < 0.1%
5.62678147 × 10^14   2      < 0.1%
5.85708896 × 10^14   1      < 0.1%
8.528012638 × 10^14  1      < 0.1%
1.123528056 × 10^15  1      < 0.1%
1.905118576 × 10^15  1      < 0.1%
2.527528149 × 10^15  1      < 0.1%
2.709834583 × 10^15  1      < 0.1%
2.838359589 × 10^15  1      < 0.1%

Maximum 10 values

Value                Count  Frequency (%)
9.998996003 × 10^18  1      < 0.1%
9.998597322 × 10^18  1      < 0.1%
9.997409247 × 10^18  1      < 0.1%
9.994767073 × 10^18  1      < 0.1%
9.991633376 × 10^18  1      < 0.1%
9.990797197 × 10^18  1      < 0.1%
9.990183617 × 10^18  2      < 0.1%
9.989795984 × 10^18  1      < 0.1%
9.989256027 × 10^18  1      < 0.1%
9.988700587 × 10^18  1      < 0.1%

DateTime

Distinct: 366
Distinct (%): 3.2%
Missing: 0
Missing (%): 0.0%
Memory size: 88.3 KiB
Minimum: 2016-08-01 00:00:00
Maximum: 2017-08-02 00:00:00
Histogram with fixed size bins (bins=50)
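
The DateTime summary can be checked for coverage: the inclusive span between the reported minimum and maximum is one day longer than the number of distinct dates. The sketch below uses only figures from this report and assumes the column has daily granularity.

```python
from datetime import date

first, last = date(2016, 8, 1), date(2017, 8, 2)  # reported minimum and maximum
distinct_reported = 366                            # reported distinct dates

calendar_days = (last - first).days + 1            # inclusive span in days
missing_days = calendar_days - distinct_reported

print(calendar_days, missing_days)
```

Under that assumption, exactly one calendar day in the range has no observations.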

total_transaction_revenue
Real number (ℝ)

HIGH CORRELATION  SKEWED 

Distinct: 6059
Distinct (%): 53.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1.469939 × 10^8
Minimum: 1200000
Maximum: 2.395256 × 10^10
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 88.3 KiB

Quantile statistics

Minimum: 1200000
5-th percentile: 14990000
Q1: 29882500
Median: 54755000
Q3: 1.139325 × 10^8
95-th percentile: 5.227675 × 10^8
Maximum: 2.395256 × 10^10
Range: 2.395136 × 10^10
Interquartile range (IQR): 84050000

Descriptive statistics

Standard deviation: 5.6179179 × 10^8
Coefficient of variation (CV): 3.8218713
Kurtosis: 823.04591
Mean: 1.469939 × 10^8
Median Absolute Deviation (MAD): 30775000
Skewness: 24.789895
Sum: 1.6589732 × 10^12
Variance: 3.1561001 × 10^17
Monotonicity: Not monotonic
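
Several of the statistics for total_transaction_revenue are tied together by simple identities, which gives a quick way to confirm that the exponents were read correctly. The sketch below re-derives them from the reported figures. The variable names are illustrative, and the tolerance only allows for the rounding in the printed values.

```python
import math

# Figures copied from the total_transaction_revenue section of this report.
n = 11286
mean, total = 1.469939e8, 1.6589732e12
std, variance, cv = 5.6179179e8, 3.1561001e17, 3.8218713
q1, q3, iqr = 29882500, 1.139325e8, 84050000
minimum, maximum, value_range = 1200000, 2.395256e10, 2.395136e10

TOL = 1e-5  # relative tolerance for rounded report values

checks = {
    "mean = sum / n": math.isclose(total / n, mean, rel_tol=TOL),
    "variance = std^2": math.isclose(std ** 2, variance, rel_tol=TOL),
    "CV = std / mean": math.isclose(std / mean, cv, rel_tol=TOL),
    "IQR = Q3 - Q1": math.isclose(q3 - q1, iqr, rel_tol=TOL),
    "range = max - min": math.isclose(maximum - minimum, value_range, rel_tol=TOL),
}

for name, ok in checks.items():
    print(f"{name}: {'ok' if ok else 'MISMATCH'}")
```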
Histogram with fixed size bins (bins=50)
Common values

Value                Count  Frequency (%)
23990000             93     0.8%
24990000             86     0.8%
25990000             82     0.7%
22990000             80     0.7%
21990000             80     0.7%
19990000             68     0.6%
20990000             59     0.5%
18990000             57     0.5%
17990000             56     0.5%
26990000             49     0.4%
Other values (6049)  10576  93.7%

Minimum 10 values

Value    Count  Frequency (%)
1200000  1      < 0.1%
2040000  1      < 0.1%
2200000  1      < 0.1%
2490000  1      < 0.1%
2500000  1      < 0.1%
2990000  7      0.1%
3010000  1      < 0.1%
3200000  2      < 0.1%
3400000  1      < 0.1%
3500000  1      < 0.1%

Maximum 10 values

Value             Count  Frequency (%)
2.395256 × 10^10  1      < 0.1%
2.31365 × 10^10   1      < 0.1%
1.78595 × 10^10   1      < 0.1%
1.603275 × 10^10  1      < 0.1%
1.563461 × 10^10  1      < 0.1%
1.466012 × 10^10  1      < 0.1%
1.151181 × 10^10  1      < 0.1%
1.090777 × 10^10  1      < 0.1%
1.059514 × 10^10  1      < 0.1%
8680830000        1      < 0.1%

channel_grouping
Categorical

HIGH CORRELATION 

Distinct: 8
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 88.3 KiB

Value             Count
Referral          5354
Organic Search    3182
Direct            2021
Paid Search       472
Display           150
Other values (3)  107

Length

Max length: 14
Median length: 11
Mean length: 9.4300018
Min length: 6

Characters and Unicode

Total characters: 106427
Distinct characters: 25
Distinct categories: 5
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1
Unique (%): < 0.1%

Sample

1st row: Direct
2nd row: Referral
3rd row: Referral
4th row: Referral
5th row: Referral

Common Values

Value           Count  Frequency (%)
Referral        5354   47.4%
Organic Search  3182   28.2%
Direct          2021   17.9%
Paid Search     472    4.2%
Display         150    1.3%
Social          97     0.9%
Affiliates      9      0.1%
(Other)         1      < 0.1%

Length

Histogram of lengths of the category

Common Values (Plot)

Words

Value       Count  Frequency (%)
referral    5354   35.8%
search      3654   24.5%
organic     3182   21.3%
direct      2021   13.5%
paid        472    3.2%
display     150    1.0%
social      97     0.6%
affiliates  9      0.1%
other       1      < 0.1%
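
The lowercase counts above differ from the category counts because they are per word: "search" is the sum of Organic Search and Paid Search, and "(Other)" appears without its parentheses. The sketch below reproduces the figures from the category counts, assuming the tokenizer lowercases, splits on whitespace, and trims surrounding punctuation. That is an inference from the numbers, not the profiler's documented behaviour.

```python
import string
from collections import Counter

# Category counts copied from the channel_grouping common values.
category_counts = {
    "Referral": 5354, "Organic Search": 3182, "Direct": 2021,
    "Paid Search": 472, "Display": 150, "Social": 97,
    "Affiliates": 9, "(Other)": 1,
}


def word_counts(counts):
    """Count lowercase word tokens, weighting each token by its category count."""
    words = Counter()
    for label, n in counts.items():
        for token in label.lower().split():
            token = token.strip(string.punctuation)  # "(other)" -> "other"
            if token:
                words[token] += n
    return words


words = word_counts(category_counts)
total = sum(words.values())
for word, n in words.most_common():
    print(f"{word:<12}{n:>6}{n / total:>8.1%}")
```

The same rule would explain why the dbscan_cluster word table shows 1 with a count of 10696: the label -1 (10430) loses its minus sign and merges with 1 (266).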

Most occurring characters

ValueCountFrequency (%)
r 19566
18.4%
e 16393
15.4%
a 12918
12.1%
c 8954
8.4%
i 5940
 
5.6%
l 5610
 
5.3%
f 5372
 
5.0%
R 5354
 
5.0%
S 3751
 
3.5%
h 3655
 
3.4%
Other values (15) 18914
17.8%

Most occurring categories

Value              Count  Frequency (%)
Lowercase Letter   87831  82.5%
Uppercase Letter   14940  14.0%
Space Separator    3654   3.4%
Open Punctuation   1      < 0.1%
Close Punctuation  1      < 0.1%
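
The category totals above follow from the Unicode general category of each character, weighted by how often each label occurs. The sketch below recomputes them with Python's standard `unicodedata` module. The function name is illustrative, and the two-letter codes (Ll, Lu, Zs, Ps, Pe) are the standard abbreviations for the category names shown in the table.

```python
import unicodedata
from collections import Counter

# Category counts copied from the channel_grouping common values.
category_counts = {
    "Referral": 5354, "Organic Search": 3182, "Direct": 2021,
    "Paid Search": 472, "Display": 150, "Social": 97,
    "Affiliates": 9, "(Other)": 1,
}


def general_category_totals(counts):
    """Sum characters per Unicode general category across all rows."""
    totals = Counter()
    for label, n in counts.items():
        for ch in label:
            totals[unicodedata.category(ch)] += n
    return totals


totals = general_category_totals(category_counts)
for code, n in totals.most_common():
    print(code, n)
```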

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
r 19566
22.3%
e 16393
18.7%
a 12918
14.7%
c 8954
10.2%
i 5940
 
6.8%
l 5610
 
6.4%
f 5372
 
6.1%
h 3655
 
4.2%
n 3182
 
3.6%
g 3182
 
3.6%
Other values (6) 3059
 
3.5%
Uppercase Letter
ValueCountFrequency (%)
R 5354
35.8%
S 3751
25.1%
O 3183
21.3%
D 2171
14.5%
P 472
 
3.2%
A 9
 
0.1%
Space Separator
ValueCountFrequency (%)
3654
100.0%
Open Punctuation
ValueCountFrequency (%)
( 1
100.0%
Close Punctuation
ValueCountFrequency (%)
) 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 102771
96.6%
Common 3656
 
3.4%

Most frequent character per script

Latin
ValueCountFrequency (%)
r 19566
19.0%
e 16393
16.0%
a 12918
12.6%
c 8954
8.7%
i 5940
 
5.8%
l 5610
 
5.5%
f 5372
 
5.2%
R 5354
 
5.2%
S 3751
 
3.6%
h 3655
 
3.6%
Other values (12) 15258
14.8%
Common
ValueCountFrequency (%)
3654
99.9%
( 1
 
< 0.1%
) 1
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 106427
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
r 19566
18.4%
e 16393
15.4%
a 12918
12.1%
c 8954
8.4%
i 5940
 
5.6%
l 5610
 
5.3%
f 5372
 
5.0%
R 5354
 
5.0%
S 3751
 
3.5%
h 3655
 
3.4%
Other values (15) 18914
17.8%

browser
Categorical

IMBALANCE 

Distinct: 9
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 88.3 KiB

Value              Count
Chrome             10204
Safari             725
Firefox            179
Internet Explorer  100
Edge               53
Other values (4)   25

Length

Max length: 17
Median length: 6
Mean length: 6.1181995
Min length: 4

Characters and Unicode

Total characters: 69050
Distinct characters: 32
Distinct categories: 6
Distinct scripts: 2
Distinct blocks: 1

Unique

Unique: 1
Unique (%): < 0.1%

Sample

1st row: Chrome
2nd row: Chrome
3rd row: Chrome
4th row: Chrome
5th row: Chrome

Common Values

Value              Count  Frequency (%)
Chrome             10204  90.4%
Safari             725    6.4%
Firefox            179    1.6%
Internet Explorer  100    0.9%
Edge               53     0.5%
Safari (in-app)    12     0.1%
Opera              6      0.1%
Android Webview    6      0.1%
Amazon Silk        1      < 0.1%

Length

Histogram of lengths of the category

Common Values (Plot)

Words

Value             Count  Frequency (%)
chrome            10204  89.5%
safari            737    6.5%
firefox           179    1.6%
internet          100    0.9%
explorer          100    0.9%
edge              53     0.5%
in-app            12     0.1%
opera             6      0.1%
android           6      0.1%
webview           6      0.1%
Other values (2)  2      < 0.1%

Most occurring characters

ValueCountFrequency (%)
r 11432
16.6%
e 10754
15.6%
o 10490
15.2%
m 10205
14.8%
C 10204
14.8%
h 10204
14.8%
a 1493
 
2.2%
i 941
 
1.4%
f 916
 
1.3%
S 738
 
1.1%
Other values (22) 1673
 
2.4%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 57502
83.3%
Uppercase Letter 11393
 
16.5%
Space Separator 119
 
0.2%
Dash Punctuation 12
 
< 0.1%
Close Punctuation 12
 
< 0.1%
Open Punctuation 12
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
r 11432
19.9%
e 10754
18.7%
o 10490
18.2%
m 10205
17.7%
h 10204
17.7%
a 1493
 
2.6%
i 941
 
1.6%
f 916
 
1.6%
x 279
 
0.5%
n 219
 
0.4%
Other values (10) 569
 
1.0%
Uppercase Letter
ValueCountFrequency (%)
C 10204
89.6%
S 738
 
6.5%
F 179
 
1.6%
E 153
 
1.3%
I 100
 
0.9%
A 7
 
0.1%
O 6
 
0.1%
W 6
 
0.1%
Space Separator
ValueCountFrequency (%)
119
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 12
100.0%
Close Punctuation
ValueCountFrequency (%)
) 12
100.0%
Open Punctuation
ValueCountFrequency (%)
( 12
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 68895
99.8%
Common 155
 
0.2%

Most frequent character per script

Latin
ValueCountFrequency (%)
r 11432
16.6%
e 10754
15.6%
o 10490
15.2%
m 10205
14.8%
C 10204
14.8%
h 10204
14.8%
a 1493
 
2.2%
i 941
 
1.4%
f 916
 
1.3%
S 738
 
1.1%
Other values (18) 1518
 
2.2%
Common
ValueCountFrequency (%)
119
76.8%
- 12
 
7.7%
) 12
 
7.7%
( 12
 
7.7%

Most occurring blocks

ValueCountFrequency (%)
ASCII 69050
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
r 11432
16.6%
e 10754
15.6%
o 10490
15.2%
m 10205
14.8%
C 10204
14.8%
h 10204
14.8%
a 1493
 
2.2%
i 941
 
1.4%
f 916
 
1.3%
S 738
 
1.1%
Other values (22) 1673
 
2.4%

traffic_source
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 40
Distinct (%): 0.4%
Missing: 0
Missing (%): 0.0%
Memory size: 88.3 KiB

Value              Count
(direct)           8648
google             2174
dfa                130
mail.google.com    60
sites.google.com   43
Other values (35)  231

Length

Max length: 25
Median length: 8
Mean length: 7.7078682
Min length: 3

Characters and Unicode

Total characters: 86991
Distinct characters: 30
Distinct categories: 7
Distinct scripts: 2
Distinct blocks: 1

Unique

Unique: 15
Unique (%): 0.1%

Sample

1st row: (direct)
2nd row: (direct)
3rd row: (direct)
4th row: (direct)
5th row: (direct)

Common Values

ValueCountFrequency (%)
(direct) 8648
76.6%
google 2174
 
19.3%
dfa 130
 
1.2%
mail.google.com 60
 
0.5%
sites.google.com 43
 
0.4%
dealspotr.com 39
 
0.3%
groups.google.com 37
 
0.3%
yahoo 22
 
0.2%
bing 20
 
0.2%
facebook.com 14
 
0.1%
Other values (30) 99
 
0.9%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
direct 8648
76.6%
google 2174
 
19.3%
dfa 130
 
1.2%
mail.google.com 60
 
0.5%
sites.google.com 43
 
0.4%
dealspotr.com 39
 
0.3%
groups.google.com 37
 
0.3%
yahoo 22
 
0.2%
bing 20
 
0.2%
facebook.com 14
 
0.1%
Other values (30) 99
 
0.9%

Most occurring characters

ValueCountFrequency (%)
e 11146
12.8%
c 8982
10.3%
d 8837
10.2%
i 8805
10.1%
t 8779
10.1%
r 8765
10.1%
( 8648
9.9%
) 8648
9.9%
o 5188
6.0%
g 4732
5.4%
Other values (20) 4461
5.1%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 69211
79.6%
Open Punctuation 8648
 
9.9%
Close Punctuation 8648
 
9.9%
Other Punctuation 473
 
0.5%
Uppercase Letter 9
 
< 0.1%
Dash Punctuation 1
 
< 0.1%
Decimal Number 1
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 11146
16.1%
c 8982
13.0%
d 8837
12.8%
i 8805
12.7%
t 8779
12.7%
r 8765
12.7%
o 5188
7.5%
g 4732
6.8%
l 2476
 
3.6%
m 355
 
0.5%
Other values (14) 1146
 
1.7%
Open Punctuation
ValueCountFrequency (%)
( 8648
100.0%
Close Punctuation
ValueCountFrequency (%)
) 8648
100.0%
Other Punctuation
ValueCountFrequency (%)
. 473
100.0%
Uppercase Letter
ValueCountFrequency (%)
P 9
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 1
100.0%
Decimal Number
ValueCountFrequency (%)
5 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 69220
79.6%
Common 17771
 
20.4%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 11146
16.1%
c 8982
13.0%
d 8837
12.8%
i 8805
12.7%
t 8779
12.7%
r 8765
12.7%
o 5188
7.5%
g 4732
6.8%
l 2476
 
3.6%
m 355
 
0.5%
Other values (15) 1155
 
1.7%
Common
ValueCountFrequency (%)
( 8648
48.7%
) 8648
48.7%
. 473
 
2.7%
- 1
 
< 0.1%
5 1
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 86991
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e 11146
12.8%
c 8982
10.3%
d 8837
10.2%
i 8805
10.1%
t 8779
10.1%
r 8765
10.1%
( 8648
9.9%
) 8648
9.9%
o 5188
6.0%
g 4732
5.4%
Other values (20) 4461
5.1%

country
Categorical

CONSTANT 

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 88.3 KiB

Value          Count
United States  11286

Length

Max length: 13
Median length: 13
Mean length: 13
Min length: 13

Characters and Unicode

Total characters: 146718
Distinct characters: 10
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: United States
2nd row: United States
3rd row: United States
4th row: United States
5th row: United States

Common Values

ValueCountFrequency (%)
United States 11286
100.0%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
united 11286
50.0%
states 11286
50.0%

Most occurring characters

ValueCountFrequency (%)
t 33858
23.1%
e 22572
15.4%
U 11286
 
7.7%
n 11286
 
7.7%
i 11286
 
7.7%
d 11286
 
7.7%
11286
 
7.7%
S 11286
 
7.7%
a 11286
 
7.7%
s 11286
 
7.7%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 112860
76.9%
Uppercase Letter 22572
 
15.4%
Space Separator 11286
 
7.7%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
t 33858
30.0%
e 22572
20.0%
n 11286
 
10.0%
i 11286
 
10.0%
d 11286
 
10.0%
a 11286
 
10.0%
s 11286
 
10.0%
Uppercase Letter
ValueCountFrequency (%)
U 11286
50.0%
S 11286
50.0%
Space Separator
ValueCountFrequency (%)
11286
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 135432
92.3%
Common 11286
 
7.7%

Most frequent character per script

Latin
ValueCountFrequency (%)
t 33858
25.0%
e 22572
16.7%
U 11286
 
8.3%
n 11286
 
8.3%
i 11286
 
8.3%
d 11286
 
8.3%
S 11286
 
8.3%
a 11286
 
8.3%
s 11286
 
8.3%
Common
ValueCountFrequency (%)
11286
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 146718
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
t 33858
23.1%
e 22572
15.4%
U 11286
 
7.7%
n 11286
 
7.7%
i 11286
 
7.7%
d 11286
 
7.7%
11286
 
7.7%
S 11286
 
7.7%
a 11286
 
7.7%
s 11286
 
7.7%

cities
Text

Distinct: 2170
Distinct (%): 19.2%
Missing: 0
Missing (%): 0.0%
Memory size: 88.3 KiB

Length

Max length: 32
Median length: 24
Mean length: 9.2098175
Min length: 3

Characters and Unicode

Total characters: 103942
Distinct characters: 63
Distinct categories: 9
Distinct scripts: 2
Distinct blocks: 3

Unique

Unique: 593
Unique (%): 5.3%

Sample

1st row: Spokane
2nd row: Pooler
3rd row: Chesterfield
4th row: Chesterfield
5th row: Murfreesboro
ValueCountFrequency (%)
city 321
 
2.1%
west 206
 
1.4%
falls 201
 
1.3%
park 181
 
1.2%
south 162
 
1.1%
north 151
 
1.0%
valley 121
 
0.8%
eagle 121
 
0.8%
east 115
 
0.8%
saint 98
 
0.7%
Other values (1978) 13350
88.8%

Most occurring characters

ValueCountFrequency (%)
e 9578
 
9.2%
a 9176
 
8.8%
n 7636
 
7.3%
o 7273
 
7.0%
l 7009
 
6.7%
r 6659
 
6.4%
i 6115
 
5.9%
t 5752
 
5.5%
s 4696
 
4.5%
3741
 
3.6%
Other values (53) 36307
34.9%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 84902
81.7%
Uppercase Letter 15140
 
14.6%
Space Separator 3741
 
3.6%
Dash Punctuation 90
 
0.1%
Other Punctuation 60
 
0.1%
Decimal Number 3
 
< 0.1%
Initial Punctuation 2
 
< 0.1%
Open Punctuation 2
 
< 0.1%
Close Punctuation 2
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 9578
11.3%
a 9176
10.8%
n 7636
9.0%
o 7273
 
8.6%
l 7009
 
8.3%
r 6659
 
7.8%
i 6115
 
7.2%
t 5752
 
6.8%
s 4696
 
5.5%
d 2819
 
3.3%
Other values (19) 18189
21.4%
Uppercase Letter
ValueCountFrequency (%)
C 1590
 
10.5%
S 1380
 
9.1%
M 1271
 
8.4%
B 1227
 
8.1%
P 948
 
6.3%
W 898
 
5.9%
L 850
 
5.6%
F 834
 
5.5%
H 751
 
5.0%
A 702
 
4.6%
Other values (16) 4689
31.0%
Other Punctuation
ValueCountFrequency (%)
' 42
70.0%
. 18
30.0%
Space Separator
ValueCountFrequency (%)
3741
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 90
100.0%
Decimal Number
ValueCountFrequency (%)
1 3
100.0%
Initial Punctuation
ValueCountFrequency (%)
2
100.0%
Open Punctuation
ValueCountFrequency (%)
( 2
100.0%
Close Punctuation
ValueCountFrequency (%)
) 2
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 100042
96.2%
Common 3900
 
3.8%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 9578
 
9.6%
a 9176
 
9.2%
n 7636
 
7.6%
o 7273
 
7.3%
l 7009
 
7.0%
r 6659
 
6.7%
i 6115
 
6.1%
t 5752
 
5.7%
s 4696
 
4.7%
d 2819
 
2.8%
Other values (45) 33329
33.3%
Common
ValueCountFrequency (%)
3741
95.9%
- 90
 
2.3%
' 42
 
1.1%
. 18
 
0.5%
1 3
 
0.1%
2
 
0.1%
( 2
 
0.1%
) 2
 
0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 103932
> 99.9%
None 8
 
< 0.1%
Punctuation 2
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e 9578
 
9.2%
a 9176
 
8.8%
n 7636
 
7.3%
o 7273
 
7.0%
l 7009
 
6.7%
r 6659
 
6.4%
i 6115
 
5.9%
t 5752
 
5.5%
s 4696
 
4.5%
3741
 
3.6%
Other values (49) 36297
34.9%
None
ValueCountFrequency (%)
ñ 5
62.5%
ā 2
 
25.0%
ī 1
 
12.5%
Punctuation
ValueCountFrequency (%)
2
100.0%

region
Text

Distinct: 51
Distinct (%): 0.5%
Missing: 0
Missing (%): 0.0%
Memory size: 88.3 KiB

Length

Max length: 20
Median length: 12
Mean length: 8.4885699
Min length: 4

Characters and Unicode

Total characters: 95802
Distinct characters: 46
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Washington
2nd row: Georgia
3rd row: Missouri
4th row: Missouri
5th row: Tennessee
ValueCountFrequency (%)
new 1202
 
8.6%
dakota 632
 
4.5%
south 604
 
4.3%
california 485
 
3.5%
north 476
 
3.4%
carolina 448
 
3.2%
texas 438
 
3.1%
idaho 436
 
3.1%
washington 434
 
3.1%
pennsylvania 418
 
3.0%
Other values (45) 8419
60.2%

Most occurring characters

ValueCountFrequency (%)
a 12588
13.1%
o 8731
 
9.1%
i 8069
 
8.4%
n 7458
 
7.8%
e 6390
 
6.7%
s 5876
 
6.1%
r 5074
 
5.3%
t 3915
 
4.1%
l 3512
 
3.7%
h 3195
 
3.3%
Other values (36) 30994
32.4%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 79242
82.7%
Uppercase Letter 13854
 
14.5%
Space Separator 2706
 
2.8%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 12588
15.9%
o 8731
11.0%
i 8069
10.2%
n 7458
9.4%
e 6390
8.1%
s 5876
7.4%
r 5074
 
6.4%
t 3915
 
4.9%
l 3512
 
4.4%
h 3195
 
4.0%
Other values (14) 14434
18.2%
Uppercase Letter
ValueCountFrequency (%)
N 2194
15.8%
M 1544
11.1%
C 1401
10.1%
I 1249
 
9.0%
D 859
 
6.2%
A 835
 
6.0%
W 709
 
5.1%
T 657
 
4.7%
S 604
 
4.4%
K 556
 
4.0%
Other values (11) 3246
23.4%
Space Separator
ValueCountFrequency (%)
2706
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 93096
97.2%
Common 2706
 
2.8%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 12588
13.5%
o 8731
 
9.4%
i 8069
 
8.7%
n 7458
 
8.0%
e 6390
 
6.9%
s 5876
 
6.3%
r 5074
 
5.5%
t 3915
 
4.2%
l 3512
 
3.8%
h 3195
 
3.4%
Other values (35) 28288
30.4%
Common
ValueCountFrequency (%)
2706
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 95802
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 12588
13.1%
o 8731
 
9.1%
i 8069
 
8.4%
n 7458
 
7.8%
e 6390
 
6.7%
s 5876
 
6.1%
r 5074
 
5.3%
t 3915
 
4.1%
l 3512
 
3.7%
h 3195
 
3.3%
Other values (36) 30994
32.4%

total_hits
Real number (ℝ)

HIGH CORRELATION 

Distinct: 204
Distinct (%): 1.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 37.454014
Minimum: 3
Maximum: 500
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 88.3 KiB

Quantile statistics

Minimum: 3
5-th percentile: 12
Q1: 19
Median: 29
Q3: 45
95-th percentile: 89
Maximum: 500
Range: 497
Interquartile range (IQR): 26

Descriptive statistics

Standard deviation: 34.417162
Coefficient of variation (CV): 0.91891785
Kurtosis: 55.117436
Mean: 37.454014
Median Absolute Deviation (MAD): 12
Skewness: 5.6419069
Sum: 422706
Variance: 1184.541
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
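
"Fixed size bins" means the range from minimum to maximum is cut into 50 intervals of equal width; for total_hits that is (500 - 3) / 50 = 9.94 hits per bin. The sketch below shows one way to compute such counts in plain Python. It is not the profiler's implementation, and the function name and sample values are made up for illustration.

```python
def fixed_width_histogram(values, bins=50):
    """Return (edges, counts) for equal-width bins spanning min..max."""
    lo, hi = min(values), max(values)
    if lo == hi:
        return [lo, hi], [len(values)]  # degenerate case: one bin holds everything
    width = (hi - lo) / bins
    counts = [0] * bins
    for v in values:
        # The maximum value would index one past the end, so clamp it
        # into the last bin.
        index = min(int((v - lo) / width), bins - 1)
        counts[index] += 1
    edges = [lo + i * width for i in range(bins + 1)]
    return edges, counts


# Made-up sample that shares the reported minimum (3) and maximum (500).
edges, counts = fixed_width_histogram([3, 3, 10, 13, 500], bins=50)
print(edges[1] - edges[0], counts[0], counts[1], counts[-1])
```

With a long right tail like this one, most observations fall into the first few bins, which is why the plotted histogram for a skewed variable looks like a single spike.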
ValueCountFrequency (%)
18 363
 
3.2%
16 351
 
3.1%
15 342
 
3.0%
14 337
 
3.0%
19 336
 
3.0%
17 334
 
3.0%
20 315
 
2.8%
23 302
 
2.7%
13 301
 
2.7%
21 297
 
2.6%
Other values (194) 8008
71.0%
ValueCountFrequency (%)
3 3
 
< 0.1%
4 8
 
0.1%
5 12
 
0.1%
6 8
 
0.1%
7 18
 
0.2%
8 53
 
0.5%
9 91
 
0.8%
10 113
1.0%
11 185
1.6%
12 245
2.2%
ValueCountFrequency (%)
500 12
0.1%
471 1
 
< 0.1%
387 6
0.1%
386 6
0.1%
385 1
 
< 0.1%
382 1
 
< 0.1%
361 1
 
< 0.1%
331 1
 
< 0.1%
328 1
 
< 0.1%
311 1
 
< 0.1%

total_pageviews
Real number (ℝ)

HIGH CORRELATION 

Distinct: 153
Distinct (%): 1.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 29.290537
Minimum: 3
Maximum: 466
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 88.3 KiB

Quantile statistics

Minimum: 3
5-th percentile: 11
Q1: 16
Median: 23
Q3: 35
95-th percentile: 64
Maximum: 466
Range: 463
Interquartile range (IQR): 19

Descriptive statistics

Standard deviation: 26.395584
Coefficient of variation (CV): 0.90116421
Kurtosis: 104.4019
Mean: 29.290537
Median Absolute Deviation (MAD): 8
Skewness: 7.8665297
Sum: 330573
Variance: 696.72683
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
16 462
 
4.1%
15 453
 
4.0%
14 452
 
4.0%
13 437
 
3.9%
18 436
 
3.9%
17 431
 
3.8%
21 396
 
3.5%
20 388
 
3.4%
12 370
 
3.3%
22 357
 
3.2%
Other values (143) 7104
62.9%
ValueCountFrequency (%)
3 3
 
< 0.1%
4 8
 
0.1%
5 12
 
0.1%
6 8
 
0.1%
7 23
 
0.2%
8 81
 
0.7%
9 133
 
1.2%
10 242
2.1%
11 355
3.1%
12 370
3.3%
ValueCountFrequency (%)
466 12
0.1%
343 6
0.1%
341 6
0.1%
305 1
 
< 0.1%
270 1
 
< 0.1%
233 1
 
< 0.1%
232 1
 
< 0.1%
224 1
 
< 0.1%
208 1
 
< 0.1%
202 1
 
< 0.1%

total_time_on_site
Real number (ℝ)

HIGH CORRELATION 

Distinct: 2716
Distinct (%): 24.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1077.0158
Minimum: 9
Maximum: 15047
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 88.3 KiB

Quantile statistics

Minimum: 9
5-th percentile: 226
Q1: 460
Median: 782
Q3: 1374
95-th percentile: 2846.75
Maximum: 15047
Range: 15038
Interquartile range (IQR): 914

Descriptive statistics

Standard deviation: 964.72145
Coefficient of variation (CV): 0.89573568
Kurtosis: 16.999072
Mean: 1077.0158
Median Absolute Deviation (MAD): 394
Skewness: 2.9828584
Sum: 12155200
Variance: 930687.47
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
611 19
 
0.2%
439 18
 
0.2%
360 18
 
0.2%
352 18
 
0.2%
356 18
 
0.2%
307 17
 
0.2%
830 17
 
0.2%
388 17
 
0.2%
500 17
 
0.2%
568 17
 
0.2%
Other values (2706) 11110
98.4%
ValueCountFrequency (%)
9 1
 
< 0.1%
34 1
 
< 0.1%
36 1
 
< 0.1%
56 2
 
< 0.1%
64 1
 
< 0.1%
77 1
 
< 0.1%
83 2
 
< 0.1%
95 1
 
< 0.1%
96 5
< 0.1%
97 3
< 0.1%
ValueCountFrequency (%)
15047 1
 
< 0.1%
12136 1
 
< 0.1%
11094 2
 
< 0.1%
9564 1
 
< 0.1%
9275 2
 
< 0.1%
8999 1
 
< 0.1%
8811 1
 
< 0.1%
8805 6
0.1%
8369 1
 
< 0.1%
7433 1
 
< 0.1%

visit_number
Real number (ℝ)

Distinct: 109
Distinct (%): 1.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 4.3429027
Minimum: 1
Maximum: 315
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 88.3 KiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 1
Median: 2
Q3: 4
95-th percentile: 12
Maximum: 315
Range: 314
Interquartile range (IQR): 3

Descriptive statistics

Standard deviation: 14.239769
Coefficient of variation (CV): 3.2788598
Kurtosis: 281.90641
Mean: 4.3429027
Median Absolute Deviation (MAD): 1
Skewness: 15.333745
Sum: 49014
Variance: 202.77102
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
1 4330
38.4%
2 2437
21.6%
3 1393
 
12.3%
4 876
 
7.8%
5 549
 
4.9%
6 345
 
3.1%
7 243
 
2.2%
8 179
 
1.6%
9 155
 
1.4%
10 129
 
1.1%
Other values (99) 650
 
5.8%
ValueCountFrequency (%)
1 4330
38.4%
2 2437
21.6%
3 1393
 
12.3%
4 876
 
7.8%
5 549
 
4.9%
6 345
 
3.1%
7 243
 
2.2%
8 179
 
1.6%
9 155
 
1.4%
10 129
 
1.1%
ValueCountFrequency (%)
315 1
 
< 0.1%
312 1
 
< 0.1%
305 1
 
< 0.1%
303 2
< 0.1%
300 1
 
< 0.1%
299 3
< 0.1%
296 1
 
< 0.1%
295 1
 
< 0.1%
293 1
 
< 0.1%
259 1
 
< 0.1%

kmeans_cluster
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 4
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 88.3 KiB

Value  Count
0      9818
3      814
1      494
2      160

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 11286
Distinct characters: 4
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 0
5th row: 0

Common Values

ValueCountFrequency (%)
0 9818
87.0%
3 814
 
7.2%
1 494
 
4.4%
2 160
 
1.4%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
0 9818
87.0%
3 814
 
7.2%
1 494
 
4.4%
2 160
 
1.4%

Most occurring characters

ValueCountFrequency (%)
0 9818
87.0%
3 814
 
7.2%
1 494
 
4.4%
2 160
 
1.4%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 11286
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 9818
87.0%
3 814
 
7.2%
1 494
 
4.4%
2 160
 
1.4%

Most occurring scripts

ValueCountFrequency (%)
Common 11286
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0 9818
87.0%
3 814
 
7.2%
1 494
 
4.4%
2 160
 
1.4%

Most occurring blocks

ValueCountFrequency (%)
ASCII 11286
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 9818
87.0%
3 814
 
7.2%
1 494
 
4.4%
2 160
 
1.4%

dbscan_cluster
Categorical

IMBALANCE 

Distinct: 4
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 88.3 KiB

Value  Count
-1     10430
0      349
1      266
2      241

Length

Max length: 2
Median length: 2
Mean length: 1.9241538
Min length: 1

Characters and Unicode

Total characters: 21716
Distinct characters: 4
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: -1
2nd row: -1
3rd row: -1
4th row: -1
5th row: -1

Common Values

ValueCountFrequency (%)
-1 10430
92.4%
0 349
 
3.1%
1 266
 
2.4%
2 241
 
2.1%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
1 10696
94.8%
0 349
 
3.1%
2 241
 
2.1%

Most occurring characters

ValueCountFrequency (%)
1 10696
49.3%
- 10430
48.0%
0 349
 
1.6%
2 241
 
1.1%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 11286
52.0%
Dash Punctuation 10430
48.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
1 10696
94.8%
0 349
 
3.1%
2 241
 
2.1%
Dash Punctuation
ValueCountFrequency (%)
- 10430
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 21716
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
1 10696
49.3%
- 10430
48.0%
0 349
 
1.6%
2 241
 
1.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 21716
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1 10696
49.3%
- 10430
48.0%
0 349
 
1.6%
2 241
 
1.1%

agg_cluster
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 4
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 88.3 KiB

Value  Count
2      9078
0      1240
1      814
3      154

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 11286
Distinct characters: 4
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2
2nd row: 2
3rd row: 2
4th row: 2
5th row: 2

Common Values

ValueCountFrequency (%)
2 9078
80.4%
0 1240
 
11.0%
1 814
 
7.2%
3 154
 
1.4%

Length

Histogram of lengths of the category

Common Values (Plot)

Value  Count  Frequency (%)
2      9078   80.4%
0      1240   11.0%
1      814    7.2%
3      154    1.4%

Most occurring characters

Value  Count  Frequency (%)
2      9078   80.4%
0      1240   11.0%
1      814    7.2%
3      154    1.4%

Most occurring categories

Value           Count  Frequency (%)
Decimal Number  11286  100.0%

Most frequent character per category

Decimal Number
Value  Count  Frequency (%)
2      9078   80.4%
0      1240   11.0%
1      814    7.2%
3      154    1.4%

Most occurring scripts

Value   Count  Frequency (%)
Common  11286  100.0%

Most frequent character per script

Common
Value  Count  Frequency (%)
2      9078   80.4%
0      1240   11.0%
1      814    7.2%
3      154    1.4%

Most occurring blocks

Value  Count  Frequency (%)
ASCII  11286  100.0%

Most frequent character per block

ASCII
Value  Count  Frequency (%)
2      9078   80.4%
0      1240   11.0%
1      814    7.2%
3      154    1.4%

device_category
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 3
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 88.3 KiB

Value    Count
desktop  10312
mobile   814
tablet   160

Length

Max length: 7
Median length: 7
Mean length: 6.9136984
Min length: 6
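
These length statistics follow directly from the category counts. As a minimal sketch, the snippet below rebuilds them from the three device_category counts reported for this variable; the variable names are chosen for this example only.

```python
import statistics

# Counts copied from the device_category summary above.
value_counts = {"desktop": 10312, "mobile": 814, "tablet": 160}

# One string length per observation.
lengths = []
for value, count in value_counts.items():
    lengths.extend([len(value)] * count)

total_characters = sum(lengths)
mean_length = total_characters / len(lengths)
median_length = statistics.median(lengths)

print(f"Max length: {max(lengths)}")
print(f"Median length: {median_length}")
print(f"Mean length: {mean_length:.7f}")
print(f"Min length: {min(lengths)}")
print(f"Total characters: {total_characters}")
```

The total character count produced here is the same figure the report lists under Characters and Unicode for this variable.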

Characters and Unicode

Total characters: 78028
Distinct characters: 12
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: desktop
2nd row: desktop
3rd row: desktop
4th row: desktop
5th row: desktop

Common Values

Value    Count  Frequency (%)
desktop  10312  91.4%
mobile   814    7.2%
tablet   160    1.4%

Length

Histogram of lengths of the category

Common Values (Plot)

Value    Count  Frequency (%)
desktop  10312  91.4%
mobile   814    7.2%
tablet   160    1.4%

Most occurring characters

Value             Count  Frequency (%)
e                 11286  14.5%
o                 11126  14.3%
t                 10632  13.6%
d                 10312  13.2%
s                 10312  13.2%
k                 10312  13.2%
p                 10312  13.2%
b                 974    1.2%
l                 974    1.2%
m                 814    1.0%
Other values (2)  974    1.2%

Most occurring categories

Value             Count  Frequency (%)
Lowercase Letter  78028  100.0%

Most frequent character per category

Lowercase Letter
Value             Count  Frequency (%)
e                 11286  14.5%
o                 11126  14.3%
t                 10632  13.6%
d                 10312  13.2%
s                 10312  13.2%
k                 10312  13.2%
p                 10312  13.2%
b                 974    1.2%
l                 974    1.2%
m                 814    1.0%
Other values (2)  974    1.2%

Most occurring scripts

Value  Count  Frequency (%)
Latin  78028  100.0%

Most frequent character per script

Latin
Value             Count  Frequency (%)
e                 11286  14.5%
o                 11126  14.3%
t                 10632  13.6%
d                 10312  13.2%
s                 10312  13.2%
k                 10312  13.2%
p                 10312  13.2%
b                 974    1.2%
l                 974    1.2%
m                 814    1.0%
Other values (2)  974    1.2%

Most occurring blocks

Value  Count  Frequency (%)
ASCII  78028  100.0%

Most frequent character per block

ASCII
Value             Count  Frequency (%)
e                 11286  14.5%
o                 11126  14.3%
t                 10632  13.6%
d                 10312  13.2%
s                 10312  13.2%
k                 10312  13.2%
p                 10312  13.2%
b                 974    1.2%
l                 974    1.2%
m                 814    1.0%
Other values (2)  974    1.2%

products
Real number (ℝ)

HIGH CORRELATION 

Distinct: 31
Distinct (%): 0.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 3.0565302
Minimum: 1
Maximum: 35
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 88.3 KiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 1
Median: 2
Q3: 4
95-th percentile: 9
Maximum: 35
Range: 34
Interquartile range (IQR): 3

Descriptive statistics

Standard deviation: 2.9844663
Coefficient of variation (CV): 0.97642295
Kurtosis: 13.00007
Mean: 3.0565302
Median Absolute Deviation (MAD): 1
Skewness: 2.9327305
Sum: 34496
Variance: 8.9070389
Monotonicity: Not monotonic
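
Several of these figures are derived from one another. The sketch below recomputes the mean, coefficient of variation, variance, range, and interquartile range for products from the other values reported in this section and the observation count in the overview. It uses only the standard definitions (for example, CV as standard deviation divided by mean), and the variable names are illustrative.

```python
# Figures copied from the products tables and the dataset overview.
n_observations = 11286   # number of observations in the dataset
total = 34496            # Sum
std_dev = 2.9844663      # Standard deviation
minimum, maximum = 1, 35
q1, q3 = 1, 4

mean = total / n_observations
coefficient_of_variation = std_dev / mean
variance = std_dev ** 2
value_range = maximum - minimum
iqr = q3 - q1

print(f"Mean: {mean:.7f}")
print(f"Coefficient of variation (CV): {coefficient_of_variation:.6f}")
print(f"Variance: {variance:.5f}")
print(f"Range: {value_range}")
print(f"Interquartile range (IQR): {iqr}")
```

Small differences in the last digits are expected, because the inputs are themselves rounded.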
Histogram with fixed size bins (bins=31)
Value              Count  Frequency (%)
1                  4203   37.2%
2                  2438   21.6%
3                  1458   12.9%
4                  1019   9.0%
5                  641    5.7%
6                  444    3.9%
7                  278    2.5%
8                  183    1.6%
9                  153    1.4%
10                 127    1.1%
Other values (21)  342    3.0%

Value  Count  Frequency (%)
1      4203   37.2%
2      2438   21.6%
3      1458   12.9%
4      1019   9.0%
5      641    5.7%
6      444    3.9%
7      278    2.5%
8      183    1.6%
9      153    1.4%
10     127    1.1%

Value  Count  Frequency (%)
35     1      < 0.1%
31     1      < 0.1%
30     1      < 0.1%
29     1      < 0.1%
28     3      < 0.1%
26     1      < 0.1%
25     1      < 0.1%
24     4      < 0.1%
23     4      < 0.1%
22     15     0.1%

Interactions

[Figure: 7 × 7 grid of pairwise interaction plots for the numeric variables]

Correlations

Variable  visitor_id  total_transaction_revenue  total_hits  total_pageviews  total_time_on_site  visit_number  products  channel_grouping  browser  traffic_source  kmeans_cluster  dbscan_cluster  agg_cluster  device_category
visitor_id  1.000  0.004  0.014  0.018  0.025  -0.008  0.011  0.031  0.023  0.034  0.017  0.000  0.018  0.020
total_transaction_revenue  0.004  1.000  0.302  0.276  0.202  0.223  0.517  0.061  0.057  0.035  0.035  0.000  0.031  0.000
total_hits  0.014  0.302  1.000  0.981  0.661  -0.009  0.561  0.064  0.018  0.158  0.451  0.073  0.377  0.089
total_pageviews  0.018  0.276  0.981  1.000  0.691  -0.030  0.529  0.072  0.000  0.178  0.405  0.050  0.365  0.096
total_time_on_site  0.025  0.202  0.661  0.691  1.000  -0.031  0.345  0.080  0.045  0.203  0.378  0.081  0.324  0.109
visit_number  -0.008  0.223  -0.009  -0.030  -0.031  1.000  0.127  0.143  0.135  0.130  0.027  0.000  0.024  0.000
products  0.011  0.517  0.561  0.529  0.345  0.127  1.000  0.026  0.009  0.000  0.184  0.046  0.201  0.044
channel_grouping  0.031  0.061  0.064  0.072  0.080  0.143  0.026  1.000  0.138  0.693  0.176  0.031  0.173  0.213
browser  0.023  0.057  0.018  0.000  0.045  0.135  0.009  0.138  1.000  0.163  0.274  0.018  0.276  0.335
traffic_source  0.034  0.035  0.158  0.178  0.203  0.130  0.000  0.693  0.163  1.000  0.138  0.003  0.109  0.171
kmeans_cluster  0.017  0.035  0.451  0.405  0.378  0.027  0.184  0.176  0.274  0.138  1.000  0.062  0.880  1.000
dbscan_cluster  0.000  0.000  0.073  0.050  0.081  0.000  0.046  0.031  0.018  0.003  0.062  1.000  0.080  0.060
agg_cluster  0.018  0.031  0.377  0.365  0.324  0.024  0.201  0.173  0.276  0.109  0.880  0.080  1.000  0.990
device_category  0.020  0.000  0.089  0.096  0.109  0.000  0.044  0.213  0.335  0.171  1.000  0.060  0.990  1.000
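
Entries between two categorical variables, such as the 1.000 for kmeans_cluster and device_category, are association measures rather than linear correlations. The report does not say which measure it uses, so the following is only a sketch under the assumption of a Cramér's V style statistic (plain, without bias correction). The function name and the toy labels are hypothetical and are not rows from this dataset.

```python
from collections import Counter
from math import sqrt


def cramers_v(xs, ys):
    """Plain (uncorrected) Cramér's V between two equal-length label lists."""
    n = len(xs)
    joint = Counter(zip(xs, ys))
    row_totals = Counter(xs)
    col_totals = Counter(ys)

    # Pearson chi-squared statistic over the full contingency table.
    chi2 = 0.0
    for r, r_count in row_totals.items():
        for c, c_count in col_totals.items():
            expected = r_count * c_count / n
            observed = joint.get((r, c), 0)
            chi2 += (observed - expected) ** 2 / expected

    k = min(len(row_totals), len(col_totals)) - 1
    return sqrt(chi2 / (n * k)) if k > 0 else 0.0


# Hypothetical toy labels: each device maps to exactly one cluster.
device = ["desktop", "desktop", "mobile", "mobile"]
cluster = [0, 0, 3, 3]
perfect = cramers_v(device, cluster)

# Hypothetical toy labels with no association at all.
unrelated = cramers_v(["a", "a", "b", "b"], [0, 1, 0, 1])

print(perfect, unrelated)
```

A value of 1 means one variable fully determines the other, which would be consistent with cluster labels that split cleanly along device type.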

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly spot patterns in data completion.

Sample

visitor_idvisit_datetotal_transaction_revenuechannel_groupingbrowsertraffic_sourcecountrycitiesregiontotal_hitstotal_pageviewstotal_time_on_sitevisit_numberkmeans_clusterdbscan_clusteragg_clusterdevice_categoryproducts
02131311426489412017-04-2839590000DirectChrome(direct)United StatesSpokaneWashington141327210-12desktop1
14353240613398692016-10-2146790000ReferralChrome(direct)United StatesPoolerGeorgia141162720-12desktop1
25626781470427352017-04-2462610000ReferralChrome(direct)United StatesChesterfieldMissouri181631920-12desktop1
35626781470427352017-04-2497700000ReferralChrome(direct)United StatesChesterfieldMissouri181631920-12desktop1
45857088960498922016-12-2145970000ReferralChrome(direct)United StatesMurfreesboroTennessee222063410-12desktop3
58528012637803222017-06-2780000000DirectChrome(direct)United StatesMorris HeightsNew York28226751002desktop3
611235280560364042016-12-1298960000DirectChrome(direct)United StatesPortlandTexas353167710-12desktop4
719051185763594872016-10-1722190000ReferralChrome(direct)United StatesWinston-SalemNorth Carolina2625158850-12desktop2
825275281491766012016-12-0217990000Organic SearchSafari(direct)United StatesLowellMassachusetts151334813-11mobile1
927098345831385812016-12-169990000DirectFirefox(direct)United StatesCasperWyoming151561410-12desktop1
visitor_idvisit_datetotal_transaction_revenuechannel_groupingbrowsertraffic_sourcecountrycitiesregiontotal_hitstotal_pageviewstotal_time_on_sitevisit_numberkmeans_clusterdbscan_clusteragg_clusterdevice_categoryproducts
1127699892560273899847682016-10-3068980000Organic SearchChromegoogleUnited StatesAltadenaCalifornia3626126223-11mobile3
1127799897959842168709122017-02-06210720000ReferralChrome(direct)United StatesMaryvaleArizona7454110160-12desktop6
1127899901836173594214402017-03-3026380000Organic SearchChromegoogleUnited StatesLongviewWashington201663510-12desktop2
1127999901836173594214402017-04-27133120000Organic SearchChromegoogleUnited StatesLongviewWashington2520181460-12desktop3
1128099907971968963461122017-04-2147250000Organic SearchChromegoogleUnited StatesFountain HillsArizona402675610-12desktop3
1128199916333760501145602017-02-1835590000SocialChromeplus.google.comUnited StatesNewtonKansas171638610-12desktop1
1128299947670732130365442016-08-09140320000Organic SearchChromegoogleUnited StatesOmahaNebraska423075560-12desktop6
1128399974092469626777602016-12-0940360000ReferralChrome(direct)United StatesLafayetteIndiana8665142320-10desktop5
1128499985973220985876482016-08-01102200000DirectChrome(direct)United StatesNewburgKentucky3733204110-12desktop2
1128599989960030432296962016-11-1766980000Organic SearchChrome(direct)United StatesRacineWisconsin161641210-12desktop2